In the original Evolutionary Minority Game, a segregation into two populations with opposing
preferences is observed under many circumstances. We show that this segregation becomes more
pronounced and more robust if the dynamics are changed slightly, such that strategies with
above-average fitness become more frequent. Similar effects also occur for a generalization of
the EMG to more than two choices, and for evolutionary dynamics of a different stochastic
strategy for the Minority Game.

I. INTRODUCTION

The Minority Game (MG) was introduced by Challet 
and Zhang in 1997 [1] as a model for the competition for 
limited resources. Although it has since been studied in 
more than 100 publications [2], and countless variations 
have been introduced, the basic scenario is still easy to 
explain: there is a population of N players who, at each
time step t, have to make a decision $\sigma_i^t \in \{-1, 1\}$. Those
who are in the minority win, the others lose (to avoid
ambiguities, N is chosen to be odd). Direct communication
and contracts among players are not allowed; however, 
the decision of the minority is public information, and 
players can base their decision on a finite number M of 
past decisions. 

Global efficiency is measured by the variance
of the sum of individual decisions,

$$\sigma^2 = \left\langle \left( \sum_{i=1}^{N} \sigma_i^t \right)^2 \right\rangle_t . \tag{1}$$

Random guessing by all players leads to $\sigma^2 = N$; a
smaller value indicates good coordination among players,
a larger value is a sign of herd behavior.
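The random-guessing baseline can be checked numerically. The following is a minimal Python sketch, not part of the original work: it estimates the quantity of Eq. (1) for players who flip fair coins. The function name `sigma_squared_random` and all parameter values are illustrative assumptions.

```python
import random


def sigma_squared_random(n_players=101, steps=5000, seed=1):
    """Estimate <(sum_i sigma_i)^2> for players who guess +1/-1 at random."""
    rng = random.Random(seed)
    acc = 0
    for _ in range(steps):
        # Sum of N independent decisions, each +1 or -1 with probability 1/2.
        total = sum(1 if rng.random() < 0.5 else -1 for _ in range(n_players))
        acc += total * total
    return acc / steps


if __name__ == "__main__":
    # The estimate should lie close to N = 101, up to sampling noise.
    print(sigma_squared_random())
```

With a few thousand time steps the estimate typically agrees with N to within a few percent; the residual deviation is sampling noise.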

One obvious feature of this game is that there is no 
unique best action for the players - if it existed, it would 
be the same for all players for symmetry reasons, and all 
players would lose. 

From the point of view of economic game theory [3],
the game has a large number of Nash equilibria - combinations
of strategies where no player can improve his
chances of winning by unilaterally changing his strategy.
For example, if $(N + 1)/2$ players choose +1 all
the time, and $(N - 1)/2$ pick -1, the global loss is minimal
($\sigma^2 = 1$); however, those who are on the losing side
stay there forever, and a player who switches sides will
cause the majority to flip, and continue losing. Simple
combinatorics show that there are $\binom{N}{(N-1)/2}$ such
combinations.

Furthermore, there are even more Nash equilibria in
mixed strategies: e.g., if all players pick +1 with a probability
of 0.5 and -1 otherwise, no single player who
develops a preference for one option will have an advantage
from this. However, if all players continue guessing,
one gets $\sigma^2 = N$, as pointed out before; better coordination
would be desirable. A vast continuum of mixed
strategies exists where no outcome is preferred - all of
these are Nash equilibria.

In the absence of a unique best way to proceed, play- 
ers have no choice but to adapt their strategy to their 
environment, i.e., the behavior of their co-players. The 
MG has become a testing ground for various forms of 
"bounded rationality", i.e., more or less simplistic deci- 
sion and learning algorithms for the agents, ranging from 
a choice between a small number of Boolean functions 
[1, 4, 5] and simple neural networks [6] to evolutionary 
algorithms [7]. 

This paper presents new aspects of the Evolutionary 
Minority Game (EMG) studied by Johnson et al. in [7], 
and introduces an evolutionary variation of the stochastic 
strategy described in [8]. Two central questions are: 1.) 
what are the consequences if a player who has "died" in 
the evolutionary process is replaced by a modified copy 
of a different player, rather than picking a strategy at 
random? 2.) Can the prescription be generalized to more 
than two choices - $1, \dots, Q$ instead of $\pm 1$, as suggested
in Ref. [9]? Let us start with a look at the original 
evolutionary MG. 



II. THE ORIGINAL EVOLUTIONARY MG 

In its original formulation [7], the EMG works as fol- 
lows: each player has access to a table which records, 
for each possible combination of M consecutive minority 
decisions, what the minority decision following the last 
occurrence of that combination was. Players have only
two individual features, namely a score $s_i$ and a probability
$p_i$. With this probability $p_i$, they choose the action
in the history table corresponding to the current history;
otherwise they choose the opposite action.

Players who win gain a point on their score, whereas
the others lose a point. If the score of a player drops
below a certain threshold $-d < 0$, the player is replaced
by a new one with a reset score $s_i = 0$ and a probability
$p_i$ that is either a modified copy of his predecessor's value
or chosen entirely at random.

As was pointed out before [10], this scheme can be 
simplified by exploiting the fact that the entries in the 
history table are decoupled, and that there is complete 
symmetry between the actions +1 and -1. The simplest
interpretation that gives the same stationary distribution
$P(p)$ would therefore be that each player picks +1 with
probability $p_i$ and -1 otherwise. This points to an analogy
with classical game theory [11]: the Minority Game
is an N-player negative-sum matrix game, and the $p_i$ define
the mixed strategies of each player.

In the original EMG, a deceased player is replaced by 
a new player whose strategy pi is a modified version of 
his predecessor's value: a random number A with a given 
variance V is added to the previous value, with reflecting 
or cyclic boundary conditions at p = 0 and p = 1. It
turns out that neither the exact value of the threshold
d nor the number N of players plays a significant role,
and that the typical size of mutations changes the results
quantitatively, but not qualitatively: the stationary
probability distribution $P(p)$ develops two peaks at p = 0
and p = 1, while there is still a significant probability for
intermediate values of p.
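The simplified dynamics described above (each player picks +1 with probability p_i, scores are updated, and a player whose score falls below the threshold is replaced by a mutated copy of himself) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the results reported here: the names `reflect` and `run_emg`, the Gaussian form of the mutation, the parameter values, and the reset of the score to 0 are choices made for this sketch.

```python
import random


def reflect(p):
    """Fold a value back into [0, 1] (reflecting boundary conditions)."""
    p = abs(p) % 2.0
    return 2.0 - p if p > 1.0 else p


def run_emg(n_players=101, d=4, mutation_std=0.1, steps=2000, seed=0):
    """Simplified EMG: returns the final strategies p_i and the mean squared sum."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(n_players)]
    score = [0] * n_players
    sum_sq = 0.0
    for _ in range(steps):
        actions = [1 if rng.random() < pi else -1 for pi in p]
        total = sum(actions)
        sum_sq += total * total
        minority = -1 if total > 0 else 1  # N is odd, so total is never 0
        for i, a in enumerate(actions):
            score[i] += 1 if a == minority else -1
            if score[i] < -d:
                # Original dynamics: mutated copy of the player's own strategy.
                p[i] = reflect(p[i] + rng.gauss(0.0, mutation_std))
                score[i] = 0
    return p, sum_sq / steps


if __name__ == "__main__":
    strategies, sigma2 = run_emg()
    extreme = sum(1 for x in strategies if x < 0.2 or x > 0.8)
    print(f"sigma^2 / N = {sigma2 / len(strategies):.3f}, "
          f"players with extreme p: {extreme}")
```

A histogram of the returned strategies, averaged over many time steps, is the quantity to compare with the two-peaked distribution described in the text.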

First attempts to calculate the probability distribu- 
tion analytically were only moderately accurate [12, 13]: 
they assumed that the reason for the self-organized seg- 
regation was only in the self-interaction of agents, and 
none of the two choices was systematically preferred. As 
newer studies [14, 15] have shown, this is not true: most 
of the time, there is a significant preference for one of 
the two options, and players who prefer this option have 
higher losses and a higher chance to be replaced. The 
preference for one side undergoes rather regular oscil- 
lations, with accompanying oscillations of the scores of 
players with one or the other preference. (The presence of 
these oscillations also means that the distribution $P(p)$ is
time-dependent, and becomes stationary only when aver- 
aged over a long time compared to the oscillation period. 
Whenever we speak of a stationary distribution from now 
on, we mean it in that sense.) 

These publications also reported that, if a winning
player receives R < 1 points rather than 1 point, the
segregation into extreme opinions vanishes below a certain
point $R_c$ and is replaced by a preference for undecided
agents with $p \approx 0.5$. This is a rather remarkable
result: after all, the aim of the game is unaffected by
the modified payoff R - it is still advantageous to be in
the minority, the chances of winning still only depend
on the set of $\{p_i\}$, and the optimal configuration still has
$(N - 1)/2$ players on one side and $(N + 1)/2$ on the other.
What has changed is the dynamics of the game, and it
is these dynamics that prevent the system from finding
a more advantageous state.

The crucial point about the evolutionary dynamics
that have been considered so far is this: they do not systematically
favor mutations with a higher fitness - they
are not compatible in the sense of Ref. [16]: "... for any
dynamic compatible with a properly defined fitness func- 
tion fitter strategies should increase relative to less fit 
strategies." We will demonstrate this in the limit of in- 
finitely large mutations, which amounts to the same as 
choosing a new strategy completely at random. Simula- 
tions indicate that the results apply to small mutations 
as well. 

Let us take the fitness f(p) associated with a strategy
p to be the negative of the probability of a player using
p of dying in a given round (in previous studies, this was
assumed to be the average gain divided by the threshold).
The strategies are distributed following a probability density
$P(p)$, with a cumulative probability distribution

$$C(p) = \int_0^p P(p')\, dp'. \tag{2}$$

The expected number of players with $p_i < p$ who mutate
in a given round is $\int_0^p -f(p') P(p')\, dp'$. Out of all
replacement players, a fraction p has a strategy
$p_i < p$. The updated probability function is therefore

$$C(p, t+1) = C(p, t) + \int_0^p f(p') P(p')\, dp' - p \int_0^1 f(p') P(p')\, dp'. \tag{3}$$

Going to continuous time and differentiating with respect
to p, the integro-differential equation for $P(p)$ looks as
follows:

$$\frac{dP(p)}{dt} = P(p) f(p) - \int_0^1 P(p') f(p')\, dp' = P(p) f(p) - \bar{f}. \tag{4}$$

Keeping in mind that the fitness always takes negative
values, Eq. (4) means that if $P(p)$ is small enough, it
increases even if the fitness associated with it is below
average - this is clearly not what is desired.

The problem can be remedied (in principle) by a small 
change in the dynamics: a player is replaced not by a 
copy of himself, but by a copy of another player. This 
makes sense in various interpretations: in an economic 
situation, "dying" could have the meaning of "going 
broke", and a player who tries a new start wants to
imitate one of his (apparently more successful) competitors.
In a biological setting, an organism literally dies, and an 
offspring of another organism takes its place. With this 
new mechanism, Eq. (3) takes the form

$$C(p, t+1) = C(p, t) + \int_0^p f(p') P(p')\, dp' - C(p, t) \int_0^1 f(p') P(p')\, dp', \tag{5}$$

which leads to the dynamic

$$\frac{dP(p)}{dt} = P(p) \left( f(p) - \bar{f} \right), \tag{6}$$

where $\bar{f} = \int_0^1 f(p') P(p')\, dp'$ is the population-averaged fitness.
This describes a so-called Malthus process, where the frequency
of strategies with above-average fitness increases
exponentially. The problem with this dynamic is that
once a strategy becomes extinct, there is no way of reviving
it. In the limit of infinitely many players that
was tacitly assumed above, this is not a problem, since
the probabilities for a strategy never go to 0 in finite
time. However, with a finite number of players imitating
each other, this would eventually lead to a small number
of sub-populations, each of which exclusively plays
one of the mixed strategies that happened to survive the
initial stage (this scenario resembles a variation of the
Backgammon model [17]).
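The qualitative difference between the two replacement rules can be illustrated with a small numerical experiment. The sketch below is an illustration under simplifying assumptions, not a reproduction of the simulations in this paper: strategy space is reduced to a few discrete strategies with fixed, frequency-independent negative fitness values (in the game itself the fitness depends on the whole population), and the two dynamics are integrated with a plain Euler scheme. The function name `integrate` and all numbers are hypothetical.

```python
def integrate(fitness, imitation, t_end=200.0, dt=0.01):
    """Euler-integrate the strategy frequencies for K discrete strategies.

    imitation=False: dP_k/dt = P_k f_k - f_bar / K   (replacement at random)
    imitation=True:  dP_k/dt = P_k (f_k - f_bar)     (copying another player)
    """
    k = len(fitness)
    prob = [1.0 / k] * k
    for _ in range(int(t_end / dt)):
        f_bar = sum(f * x for f, x in zip(fitness, prob))
        if imitation:
            rate = [x * (f - f_bar) for f, x in zip(fitness, prob)]
        else:
            rate = [x * f - f_bar / k for f, x in zip(fitness, prob)]
        prob = [x + dt * r for x, r in zip(prob, rate)]
    return prob


if __name__ == "__main__":
    fitness = [-0.1, -0.2, -0.4]  # hypothetical death rates, all negative
    print("random replacement:", integrate(fitness, imitation=False))
    print("imitation:         ", integrate(fitness, imitation=True))
```

Under random replacement the frequencies settle at values proportional to 1/|f_k|, so unfit strategies stay populated; under imitation the fittest strategy takes over almost completely.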

To get a well-defined final state independent of initial
conditions, it is therefore necessary to add a small mutation
to the copied strategy to explore unoccupied areas
in strategy space - for example by adding a Gaussian random
number of variance $V \ll 1$ to p, with reflecting boundary
conditions. If this is done, a stationary probability distribution
emerges that is strongly peaked at 0 and 1 and vanishes
for intermediate p, as seen in Fig. 1. Global losses are reduced
dramatically: for example, for $V = 10^{-4}$, one gets
$\sigma^2 \approx 0.021 N$ instead of $\sigma^2 \approx 0.31 N$ for the original EMG
for large d. Coordination can be improved even more by
decreasing V, at the cost of a longer equilibration time.
Furthermore, the results of decreasing the reward R for
winning, as suggested in Ref. [14], are less dramatic:
decreasing R does not destroy segregation; however, the
peaks at extreme strategies become wider, and global
efficiency decreases somewhat.
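The replacement rule with imitation and a small mutation can be written compactly. The following Python sketch shows one possible form, under assumptions: the helper names `reflect` and `imitate_with_mutation` are hypothetical, and the donor is drawn uniformly from all other players.

```python
import random


def reflect(p):
    """Fold a value back into [0, 1] (reflecting boundary conditions)."""
    p = abs(p) % 2.0
    return 2.0 - p if p > 1.0 else p


def imitate_with_mutation(p, dead, variance, rng):
    """Return a new strategy for player `dead`: a copy of another player's
    strategy plus Gaussian noise of the given variance."""
    donor = rng.randrange(len(p) - 1)
    if donor >= dead:
        donor += 1  # skip the deceased player himself
    return reflect(p[donor] + rng.gauss(0.0, variance ** 0.5))


if __name__ == "__main__":
    rng = random.Random(0)
    strategies = [0.02, 0.97, 0.51, 0.99]
    strategies[2] = imitate_with_mutation(strategies, dead=2, variance=1e-4, rng=rng)
    print(strategies)
```

Setting the variance to zero recovers pure copying, which is the case where extinct strategies can never reappear.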
FIG. 1: Stationary distribution P(p) of the EMG with imitation,
compared to the EMG without imitation with large mutations.
Parameters were N = 501, d = 10, R = 1, $V = 10^{-4}$.



III. MULTI-CHOICE EMG 

Generalizations of the MG to more than two options 
were studied in [9] (with agents using neural networks) 
and [18, 19] (with agents using a set of decision tables). 
The basic idea is simple: 

• each player now picks an action (or "room") $a_i \in \{1, \dots, Q\}$
out of Q options;

• the number of players $N_q$ who chose each option is
determined: $N_q = \sum_i \delta_{a_i, q}$;

• the option chosen by the fewest players (the "least
crowded room", with occupation $N_{\min}$) is declared
winner (a coin toss decides in case of a tie);

• the players who chose the winning room receive an
award (let us say, one point), whereas the others
lose $1/(Q - 1)$ points.
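The four steps above can be sketched as a single function. This is a minimal Python illustration; the name `play_round` and the return format are assumptions made for this sketch, and an empty room is treated like any other room (it wins if no one chose it).

```python
import random


def play_round(actions, n_options, rng):
    """One round of the multi-choice Minority Game.

    actions: list of chosen rooms, each in 1..n_options.
    Returns (occupations, winning_room, payoffs).
    """
    counts = [0] * n_options
    for a in actions:
        counts[a - 1] += 1
    n_min = min(counts)
    # All least-crowded rooms; a random pick plays the role of the coin toss.
    candidates = [q for q in range(1, n_options + 1) if counts[q - 1] == n_min]
    winner = rng.choice(candidates)
    loss = -1.0 / (n_options - 1)
    payoffs = [1.0 if a == winner else loss for a in actions]
    return counts, winner, payoffs


if __name__ == "__main__":
    rng = random.Random(0)
    print(play_round([1, 1, 2, 3, 3, 3], 3, rng))
```

The loss of 1/(Q - 1) points makes the expected payoff zero for a player who is equally likely to end up in any room.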

Global efficiency can be measured by taking the analog 
of Eq. (1) either for the occupation of the winning room: 



(7) 




or the occupation of all rooms: 
In many cases, the latter quantity differs from $\sigma_{\min}^2$ only
by a constant factor and is easier to calculate [9]. For the
reference case of random guessing, $\sigma_Q^2$ takes the value of
N/Q.

The generalization of the evolutionary MG to multiple
choices leaves several options. We choose the one
that yields a standard multi-player matrix game: each
player is equipped with a strategy vector $\mathbf{p}_i$, with entries
$p_{i,q} \geq 0$ that give the probability of player i choosing
room q. These vectors obey the normalization constraint
$\sum_q p_{i,q} = 1$.

How strongly a player specializes in one option can be
measured using the self-overlap of his strategy vector:

$$O_i = \mathbf{p}_i \cdot \mathbf{p}_i = \sum_{q=1}^{Q} p_{i,q}^2 . \tag{9}$$

This quantity varies from 1/Q for a completely undecided
player to 1 for a player who chooses one option exclusively.
The average over the population, $O = \sum_i O_i / N$,
is therefore a good measure of the degree of specialization
among players.
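As a small worked example, the self-overlap and its population average can be computed as follows. The function names are illustrative assumptions.

```python
def self_overlap(p):
    """Self-overlap of one strategy vector: the sum of squared probabilities."""
    return sum(x * x for x in p)


def mean_overlap(vectors):
    """Population average of the self-overlap."""
    return sum(self_overlap(p) for p in vectors) / len(vectors)


if __name__ == "__main__":
    undecided = [1 / 3, 1 / 3, 1 / 3]
    specialist = [1.0, 0.0, 0.0]
    print(self_overlap(undecided))               # lower bound 1/Q for Q = 3
    print(self_overlap(specialist))              # upper bound 1
    print(mean_overlap([undecided, specialist])) # average of the two
```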

If a player's score drops below the threshold —d, he 
is replaced by a player with a different strategy vector. 
Again, two different paths suggest themselves: either, as 
in Ref. [7], the player is replaced by an altered copy 
of himself (or, in the extreme case of large mutations, a 
randomly chosen new player); or the gap is filled by the
mutated offspring of another, randomly chosen player.
We ran simulations starting with random initial vectors
uniformly distributed on the simplex (see the Appendix).
The same picture emerges as in the binary-choice case: 

If deceased players are replaced by copies of themselves
or players with random strategies, very little specialization
occurs. The stationary distribution of $O_i$ gets
slightly more contributions from larger values; however,
the mean value shifts only a little. E.g., for Q = 3 and
d = 10, the average O changes from 1/2 for no adaptation
to $\approx 0.540$ for replacing deceased players with randomly
chosen strategies. Correspondingly, global efficiency
increases only slightly: in this case, from $\sigma_Q^2/N = 1/3$ to
$\approx 0.242$.

However, copying another player (with small muta- 
tions) gives excellent coordination: in the stationary 
state, all players specialize strongly in one of the choices 
- the self-overlap of strategies is close to 1 (see Fig. 2). 
The width of the probability distribution of $O_i$ again depends
on the magnitude of mutations: the smaller the
mutations, the narrower the peak. As before, eliminat- 
ing mutations altogether prevents coordination. 
FIG. 2: Stationary distribution P(O) of the self-overlap of
strategy vectors for Q = 3 options. The solid line gives the
initial state of vectors chosen uniformly on the simplex. The
dashed and dotted lines give the result of the original dynamics:
it matters little whether replaced players undergo small mutations
or are chosen completely at random. The dot-dashed
line shows the effect of imitation with small mutations. The
same parameters as in Fig. 1 were used.



IV. THE STOCHASTIC MG 

In Ref. [8], a strategy was presented that looks similar
to the EMG described above: again, each agent is
equipped with a probability $p_i$ that characterizes his behavior.
The meaning of p is different, though: if a player
wins in a given round, he is content and repeats his choice
$\sigma_i^t$ in the following round. If he loses, however, the agent
may rethink his game plan and switch to the opposite action
with probability $p_i$. This prescription amounts to a
one-step Markov process which can be solved analytically
in some regimes if all players use the same p [8, 20].

For large p (of order 1), a finite fraction of the population
switches at every time step, resulting in large global
losses ($\sigma^2 = O(N^2)$). Furthermore, the majority flips
very frequently. The stationary probability distribution
of $A = \sum_i \sigma_i$ takes the shape of two roughly Gaussian
peaks centered at $\pm N p/(2 - p)$.

However, if p scales as $p = 2x/N$ with $x = O(1)$, there is
very good coordination ($\sigma^2 = 1 + 4x + 4x^2/3$ as $N \to \infty$),
and the minority does not switch at every time step. The
stationary probability distribution is centered at A = 0,
with a width of roughly 2x.
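The two regimes can be compared in a direct simulation of the stochastic strategy with a uniform switching probability. The sketch below is a minimal Python illustration, not the original simulation code; the function name `run_smg`, the burn-in length and the parameter values are assumptions.

```python
import random


def run_smg(n_players=201, p=0.01, steps=3000, burn_in=500, seed=0):
    """Stochastic MG with uniform p: losers switch sides with probability p.

    Returns the time average of A^2, where A is the sum of all decisions.
    """
    rng = random.Random(seed)
    sigma = [1 if rng.random() < 0.5 else -1 for _ in range(n_players)]
    acc = 0.0
    for t in range(steps):
        total = sum(sigma)
        if t >= burn_in:
            acc += total * total
        majority = 1 if total > 0 else -1  # N odd, so there is never a tie
        for i in range(n_players):
            # Players on the majority side have lost this round.
            if sigma[i] == majority and rng.random() < p:
                sigma[i] = -sigma[i]
    return acc / (steps - burn_in)


if __name__ == "__main__":
    n = 201
    print("p = 2/N :", run_smg(n_players=n, p=2.0 / n))  # small, of order 1
    print("p = 0.5 :", run_smg(n_players=n, p=0.5))      # large, of order N^2
```

For p = 2x/N with x = 1, the measured value can be compared with the asymptotic expression quoted in the text; finite-N corrections are to be expected.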

The relative simplicity of the mathematical description
breaks down if each player is allowed to have an individual
$p_i$: if players are distinguishable, it is no longer
enough to state how many of them are on one side or the
other to completely characterize the system. However,
with a few approximations, some insight can still
be gained even if complications like evolutionary dynamics
are introduced.

We start with a scenario analogous to that described in 
Sec. II, which we will call SEMG (stochastic evolutionary 
Minority Game) : each of the N players has an individual 
probability pi of switching, which is initially a uniform 
random number. In the case of a loss, the player loses a 
point; otherwise he wins one. If his score drops below $-d$,
his probability is replaced, and his score reset to 0. One
would expect that players with smaller p have an advantage
over those with large p, and one would hope that
players organize themselves to a stationary state with
$p_i \propto 1/N$.

If mutations of players are performed analogously to the
original EMG (new players are modified copies of the deceased
ones), a stationary distribution $P(p)$ emerges in
which small p are more likely than large ones, but there is
still a significant tail towards large p (see Fig. 3). Simulations
show that contrary to the original EMG, the details
of the mutation process (size of the mutation, reflecting or
cyclic boundary conditions etc.) have no impact at all
on this distribution. Neither does the threshold d. The
number of players N has only a small effect - the shape of
the distribution for larger p does not change significantly,
but $P(p)$ increases for very small p as N increases. This
effect is explained later; however, it is much too small to
achieve a mean p that scales with 1/N.

FIG. 3: Stationary distribution P(p) in the Stochastic EMG,
for different values of N. New players use randomly chosen
values of p; the threshold is d = 10.

As mentioned before, $p = O(1)$ means that the minority
changes sides at practically every time step. Assuming
that this is true, it is possible to calculate the average
gain of a player with a given p, and hence his expected
lifetime. From this, the stationary distribution can be
calculated. Let us start with a more general approach,
which assumes very simple dynamics of the global decision
and neglects the impact of a single player on that
decision. If the global minority stays the same in two
consecutive time steps with a given probability $\mu$, the
average gain of a player can be determined as follows:
with a probability w(p), the player wins in a given round.
Consequently, he does not switch sides, and wins again
in the next round with probability $\mu$. However, if he lost
in the first round, he can win by changing sides if the
minority stays on the same side (probability $\mu p$) or if he
insists on his opinion, and the minority switches (probability
$(1 - \mu)(1 - p)$). For a given p, the probability of
winning is therefore

$$w(p) = \mu\, w(p) + \left[ \mu p + (1 - \mu)(1 - p) \right] \left( 1 - w(p) \right). \tag{10}$$

For the average gain $g(p) = 2w(p) - 1$, this gives

$$g(p) = \frac{-p\,(1 - 2\mu)}{1 + (1 - p)(1 - 2\mu)}. \tag{11}$$
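The step from the self-consistency condition for w(p) to the closed form for g(p) can be verified numerically. In the Python sketch below, `win_probability` solves the linear condition for w, and `average_gain` evaluates the closed form; both function names are assumptions made for this illustration, and `mu` denotes the probability that the minority stays on the same side.

```python
def win_probability(p, mu):
    """Solve w = mu*w + [mu*p + (1-mu)*(1-p)] * (1-w) for w."""
    a = mu * p + (1.0 - mu) * (1.0 - p)  # chance to win after a loss
    return a / (1.0 - mu + a)


def average_gain(p, mu):
    """Closed form of g(p) = 2 w(p) - 1."""
    return -p * (1.0 - 2.0 * mu) / (1.0 + (1.0 - p) * (1.0 - 2.0 * mu))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        for mu in (0.0, 0.3, 0.7):
            w = win_probability(p, mu)
            print(p, mu, 2.0 * w - 1.0, average_gain(p, mu))
```

For mu = 0 the closed form reduces to g(p) = p/(p - 2), so every player with p > 0 loses on average and players with small p lose least.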



As long as g(p) is systematically negative and does not
undergo fluctuations on time scales comparable to the
lifetime of a player, it is safe to assume that the mean
lifetime L(p) is $d/|g(p)|$ - the score is a random walk with
a negative drift which outweighs the diffusive motion for
sufficiently large d.

Assuming that the average p is large enough to ensure
that the majority will flip at every time step - i.e., $\mu = 0$,
and $g(p) = p/(p - 2)$ - one can now identify
$-1/L(p) = -p/[d(2 - p)]$ with the fitness f(p) and use Eq. (4) to derive,
for the stationary state,

$$P(p) \propto \frac{2 - p}{p}. \tag{12}$$



This probability distribution has the unpleasant property
of diverging at p = 0, since agents with p = 0 are assigned
an infinite lifetime. In practice, three effects come into
play: first, agents never have exactly p = 0. Second, even
for agents with p = 0, their impact on the decision will
give them a very small negative average gain and a long,
but not infinitely long life. Third, $\mu$ is very small, but
never strictly equal to 0; if the probability of large values
of p becomes too small, the simplistic assumptions about
the dynamics no longer hold, and $\mu$ increases. Together,
these effects ensure that for any given set of parameters,
a stationary distribution emerges. Eq. (12)
allows for a good fit to the stationary distributions measured
in simulations, as seen in Fig. 3.

The average p that emerges from these simulations is 
of order 1 (to be more specific, around 0.10, with the pre- 
cise value depending on N and d). This means that the 
solution is self-consistent: the value of p that results from 
the dynamics is large enough to justify the assumptions 
that went into estimating it. 

Analogous to Sec. II, the evolutionary dynamic of re- 
placing a player with a random player or a copy of the old 
one does not always favor strategies with higher fitness. 
However, the same step can be taken to improve coordi- 
nation: if deceased players are replaced with a copy of 
another player chosen at random, the relative growth of 
P(p) is proportional to $f(p) - \bar{f}$, just as in Eq. (6).

Unfortunately, Eq. (11) is not applicable for very small 
p (since it neglects the impact of the considered agent), 
and there is no simple equation that gives the fitness as 
a function of the strategy for all regimes. Nevertheless, 
there seems to be no situation where having a higher p 
gives better results. Hence, the evolutionary dynamic 
should lead to a state with minimal p for all agents. A
similar problem as in Sec. II comes into play here, although
it does not have quite as troubling effects: with
a finite number of players, the best possible coordination
is for all players to adopt the smallest value of p that survived
the initial stage. However, this value is usually of
order 1/N - if initial values are chosen at random, they
have an average distance of 1/N.

Just as in Sec. II, the sensitivity to the initial state can
be removed by adding a small random number to $p_i$ when
a new player is created. As seen in Fig. 4, results are
similar: a peak centered at p = 0 emerges, whose width
depends on the size of the mutations. With sufficiently
small mutations, $p = O(1/N)$ and $\sigma^2 = O(1)$ can easily
be achieved.

One of the drawbacks of introducing small mutations is 
that their size is a new parameter that has to be adjusted 
to get a $\sigma^2$ of order 1. One of the conceptual flaws of the
SMG was that players had to be aware of the size of the 
population to justify an adequate choice of p. One might 
have hoped that in an evolutionary scheme, the correct 
scaling for p would emerge naturally. With the dynamic 
of copying with mutations, it does not. Maybe an evo- 
lutionary mechanism that mutates the size of mutations 
would solve this. 
FIG. 4: Stationary distribution P(p) in the Stochastic EMG
with copying. The variance V of mutations influences the
width of the distribution. The simulation included N = 2000
players, with a threshold d = 10.



V. MULTI-CHOICE STOCHASTIC MG 

The stochastic MG can be generalized to multiple 
choices in several ways, none of which is quite as nat- 
ural as the generalization of the EMG in Sec. III. One 
of the most intuitive ways is this: again, players have the
choice between Q different actions, or "rooms". Those
who chose the least-crowded room are content and choose
it again, whereas all others decide, with probability $p_i$,
to look for alternatives. If they are not informed about
which room won, they randomly pick one of the rooms
that they did not choose the last time. Another plausible
scenario is that they know which room won, and
choose it with probability p the next time.

In both cases, the system can still be considered a one-step
Markov process. However, the state of the system
must now be characterized by Q - 1 values $N_1, \dots, N_{Q-1}$,
which give the number of players that chose each action
(the remaining value $N_Q$ can be calculated from the normalization
constraint $\sum_q N_q = N$), and the joint probability
distribution is a (Q - 1)-dimensional tensor, or a
function living in $\mathbb{R}^{Q-1}$ if one wants to go to continuous
variables. Transition probabilities look even worse, taking
the form of 2(Q - 1)-dimensional tensors or integral
kernels. Put briefly, this problem is only accessible to
simulations and very crude approximations.

In the limit of $p = O(1)$, $N \to \infty$, the behavior is
analogous to that detailed in Sec. IV: finite fractions of
players move from room to room, and the minority option
changes at every time step. Suitable variables are
$n_q = N_q/N$, the fractions of players who chose option q.
Occupation probabilities $P(n_q)$ turn out to be a superposition
of Q Gaussian peaks whose widths decrease with
N. Self-consistent values for the centers of the peaks can
be found analytically, as the following example for Q = 3
will show:

At any given step, there are three occupation numbers,
which we order $n_1 < n_2 < n_3$. Room 1 will now
receive players from rooms 2 and 3, whereas rooms 2 and
3 gain players from the respective other room and lose
players to all other rooms. Neglecting fluctuations, the
rate equations for the occupations $n_q'$ at the next time
step look like this:

$$\begin{aligned}
n_1' &= n_1 + (p/2)(n_2 + n_3); \\
n_2' &= n_2 - p\, n_2 + (p/2)\, n_3; \\
n_3' &= n_3 - p\, n_3 + (p/2)\, n_2.
\end{aligned} \tag{13}$$

If one can find a permutation of the $n_i$ such that each $n_i$ is
equal to $n_j'$ with some $j \neq i$, one has a solution. In the
present case, the solution for Eq. (13) is $n_1' = n_3$, $n_2' = n_1$
and $n_3' = n_2$ for $0 < p < 2/3$, and $n_1' = n_3$, $n_2' = n_2$
and $n_3' = n_1$ for $2/3 < p < 1$. The corresponding
equations are

$$\begin{aligned}
n_1 &= \begin{cases} \dfrac{4 - 6p + 3p^2}{3(2 - p)^2} & \text{for } p < 2/3, \\[2ex] \dfrac{4 - 3p}{10 - 3p} & \text{for } p > 2/3; \end{cases} \\
n_2 &= \begin{cases} \dfrac{4(1 - p)}{3(2 - p)^2} & \text{for } p < 2/3, \\[2ex] \dfrac{2}{10 - 3p} & \text{for } p > 2/3; \end{cases} \\
n_3 &= \begin{cases} \dfrac{2}{3(2 - p)} & \text{for } p < 2/3, \\[2ex] \dfrac{4}{10 - 3p} & \text{for } p > 2/3. \end{cases}
\end{aligned} \tag{14}$$



This solution agrees well with behavior observed in sim- 
ulations, as Fig. 5 shows. 
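The self-consistency of the peak centers can be checked directly: one deterministic update step should map the ordered set of occupations onto a permutation of itself. The Python sketch below assumes the rate equations in the form n1' = n1 + (p/2)(n2 + n3), n2' = n2 - p n2 + (p/2) n3, n3' = n3 - p n3 + (p/2) n2, and the closed-form centers as written in the code (reconstructed by solving these equations together with n1 + n2 + n3 = 1). The function names `step` and `peak_centers` are hypothetical.

```python
def step(n, p):
    """One deterministic update of the ordered occupations n1 < n2 < n3."""
    n1, n2, n3 = n
    return (
        n1 + 0.5 * p * (n2 + n3),    # the least-crowded room only gains players
        n2 - p * n2 + 0.5 * p * n3,
        n3 - p * n3 + 0.5 * p * n2,
    )


def peak_centers(p):
    """Closed-form self-consistent occupations for Q = 3 (assumed form)."""
    if p < 2.0 / 3.0:
        denom = 3.0 * (2.0 - p) ** 2
        return (
            (4.0 - 6.0 * p + 3.0 * p * p) / denom,
            4.0 * (1.0 - p) / denom,
            2.0 / (3.0 * (2.0 - p)),
        )
    denom = 10.0 - 3.0 * p
    return ((4.0 - 3.0 * p) / denom, 2.0 / denom, 4.0 / denom)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.6, 0.8, 0.95):
        n = peak_centers(p)
        print(p, n, sorted(step(n, p)))
```

For each p, the sorted result of `step` reproduces the input triple, and the three occupations add up to 1.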




FIG. 5: Centers of the peaks of $P(n_q)$ in the stochastic MG
with Q = 3. Simulations agree well with Eqs. (14). The inset
shows the probability distribution for p = 0.3 and N = 2000.

For small uniform p of order Q/N, analytical results 
are hard to find, for the mentioned reasons. Evidence 
from simulations shows that the system organizes itself 
into a probability distribution close to optimal coordi- 
nation. The details of the distribution depend, even for 
large N, on N mod Q. An example of such a distribution
is shown in Fig. 6.

FIG. 6: Stationary probability distribution of occupation
numbers $N_q$ in the multi-choice SMG, with uniform p =
1/2000 ≈ 1/N, for Q = 3.

Evolutionary dynamics can be introduced in exact analogy
to the previous sections, and very similar results
are observed: choosing a new player at random and replacing
a player by a mutated copy of himself yield
the same stationary probability distribution, which has
a long tail towards larger p. Average values for p are
around 0.2, and the probability distribution of occupation
numbers $N_q$ shows multiple peaks.

The alternative dynamic of copying another player
with mutation gives a sharp peak around p = 0, with
$\sigma_Q^2$ on the order of 1 for sufficiently small mutations.

VI. CONCLUDING REMARKS

We have shown that the self-organized segregation observed
in the evolutionary Minority Game is not only
much more pronounced, but also more robust to modifications
of the payoff scheme if a suitable dynamic is used
- one that allows strategies with above-average fitness to
grow, rather than keeping sub-par strategies alive. Copying
another player's strategy is a suitable way of doing
this. The copy has to be modified by a small mutation
to eliminate sensitivity to initial conditions.

We have also introduced a natural generalization of
the EMG to multiple choices, evolutionary dynamics for
the Stochastic MG, and an extension of the latter to
multiple choices. The results in all cases have striking
similarities: properly chosen evolutionary dynamics lead
to near-optimal coordination and drastic suppression of
losses, compared to random guessing.

It is somewhat ironic that the key ingredient to these
dynamics is copying the strategy of another player - in
a game where the goal is to take different actions than
the majority. One could argue that in the regime of the
EMG where players have very strong preferences for either
side, the EMG is not too different from the Stochastic
MG: players stick to their opinion if possible; if they
lose too often, they either copy a player from their own
side (which changes little about the situation), or they
copy a player from the other side - they switch their
output. The presence of accumulated scores makes the
situation more complicated than that, yielding preference
oscillations whose wavelength depends on the threshold.

Despite the increased efficiency and robustness that
the imitation mechanism has brought, the dynamics are
still complex, and a thorough analytical treatment has
not been found yet. The same holds for the SEMG,
where the interplay between the probability distributions
of outputs and of strategies is difficult to handle analytically.
Maybe future research will fill in the missing
details.

APPENDIX A: CONSTRUCTING UNIFORMLY
DISTRIBUTED VECTORS ON A SIMPLEX

There are many conceivable methods of finding Q-dimensional
random vectors $\mathbf{p}$ that obey the constraints
of probabilities, $p_i \geq 0$ and $\sum_i p_i = 1$. However, the easiest
ones do not give a uniform distribution on the simplex
of allowed vectors: for example, forcing a set of uniform
random numbers between 0 and 1 to obey the constraints
by dividing them by their sum emphasizes vectors in the
center of the simplex due to projection effects.

The following method generates uniformly distributed 
probability vectors from uniformly generated random 
numbers. We include it because it may be useful to the 
reader for other applications. 

The space of allowed vectors p is spanned by linear
combinations

$$\mathbf{p} = \sum_q a_q \mathbf{b}_q \tag{A1}$$

of a set of Q basis vectors $\mathbf{b}_q$
with coefficients $1 = a_1 > a_2 > \dots > a_Q > 0$. The components
must be chosen with a suitably weighted probability
distribution to account for the fact that a larger
coefficient $a_2$ allows for more combinations of $a_3$, $a_4$ etc.
Since the volume of the sub-simplex limited by $a_q$ is proportional
to $a_q^{Q-q}$, the appropriate distribution is

$$\mathrm{Prob}(a_q) = \begin{cases} (Q - q + 1)\, \dfrac{a_q^{Q-q}}{a_{q-1}^{Q-q+1}} & \text{for } 0 < a_q < a_{q-1}, \\[2ex] 0 & \text{else.} \end{cases} \tag{A3}$$

Consequently, a set of coefficients $\{a_2, \dots, a_Q\}$ can be
calculated from a set of random numbers $\{r_2, \dots, r_Q\}$
uniformly distributed between 0 and 1 by a simple transformation [21]:

$$a_1 = 1; \quad a_q = a_{q-1}\, r_q^{1/(Q-q+1)}. \tag{A4}$$

Eq. (A1) then gives the desired vector on the simplex.
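The construction can be sketched in Python as follows. The coefficients follow the recursion a_1 = 1, a_q = a_{q-1} r_q^{1/(Q-q+1)}. The choice of basis vectors is an assumption of this sketch, since their definition is not reproduced above: it uses b_1 = e_1 and b_q = e_q - e_{q-1}, so that the components are the differences of successive coefficients. The function name `uniform_simplex_vector` is hypothetical.

```python
import random


def uniform_simplex_vector(q, rng):
    """Draw a probability vector uniformly from the (q-1)-simplex."""
    a = [1.0]
    for k in range(2, q + 1):
        # a_k = a_{k-1} * r^(1/(Q-k+1)), with r uniform in [0, 1)
        a.append(a[-1] * rng.random() ** (1.0 / (q - k + 1)))
    # Assumed basis: p_k = a_k - a_{k+1} for k < Q, and p_Q = a_Q.
    return [a[k] - a[k + 1] for k in range(q - 1)] + [a[-1]]


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [uniform_simplex_vector(3, rng) for _ in range(20000)]
    mean_overlap = sum(sum(x * x for x in p) for p in samples) / len(samples)
    # For a uniform distribution on the simplex with Q = 3, the mean
    # self-overlap is 2/(Q+1) = 1/2.
    print(mean_overlap)
```

The mean self-overlap of 1/2 for Q = 3 matches the initial value quoted for vectors drawn uniformly on the simplex, which gives a quick consistency check of the sampler.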